A Cross Gender and Cross Lingual Study on Acoustic Features for Stress Recognition in Speech
Authors
Abstract
We present a systematic study of acoustic features for emotional stress in university students across gender and language groups. We design a common questionnaire of stress-inducing and non-stress-inducing questions in Chinese and English, and interview 25 native speakers of Mandarin and 31 native speakers of English, of both genders. We extract 560 acoustic features, including low-level descriptors and Teager energy operator (TEO) features. Our acoustic feature-based classifier recognizes stress in the subjects’ speech with 81.28% accuracy on average within the same gender and language group, substantially outperforming human perception tests, which reached only 39.27%. Moreover, we show for the first time that whereas the emotion detection accuracy decreases by 28.18% across gender, our system maintains the same performance across Mandarin and English. Feature ranking experiments show that the most important stress features are TEO and MFCCs, rather than pitch. This explains the relative language independence of our model, even though Mandarin is a tonal language. TEO features are also found to be insensitive to gender differences.
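For readers unfamiliar with the TEO mentioned in the abstract, the following is a minimal sketch of the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. It is not the authors' feature pipeline, which the abstract does not describe; the function name `teager_energy` and the test tone are illustrative assumptions.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Hypothetical helper for illustration only. Returns one value per
    interior sample, so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Sanity check on a pure tone A * cos(omega * n): the operator gives the
# constant A^2 * sin(omega)^2, which depends on both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (amplitude * math.sin(omega)) ** 2
```

Because the operator responds jointly to amplitude and frequency rather than to pitch alone, it is plausible that features built on it are less tied to tonal contrasts, though that reading goes beyond what the abstract itself states.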
Similar resources
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Continuous local codebook features for multi- and cross-lingual acoustic phonetic modelling
In this paper we present a method for defining the question set for the induction of acoustic phonetic decision trees. The method is data driven resulting in a continuous feature space in contrast to the usual categorical one. We apply the features to a multilingual speech recognition task, outperforming consistently the standard method using IPA-based characteristics. An extension to cross-lin...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Cross Lingual Modelling Experiments for Indonesian
The extension of Large Vocabulary Continuous Speech Recognition (LVCSR) to resource poor languages such as Indonesian is hindered by the lack of transcribed acoustic data and appropriate pronunciation lexicons. Research has generally been directed toward establishing robust cross-lingual acoustic models, with the assumption that phonetic lexicons are readily available. This is not the case for ...
Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants
Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...